On Avoiding Sufficiently Long Abelian Squares 



Elyot Grant 
December 3, 2010 



Abstract 

A finite word $w$ is an abelian square if $w = xx'$ with $x'$ a permutation of $x$. In 1972, 
Entringer, Jackson, and Schatz proved that every binary word of length $k^2 + 6k$ contains 
an abelian square of length $\ge 2k$. We use Cartesian lattice paths to characterize abelian 
squares in binary sequences, and construct a binary word of length $q(q+1)$ avoiding 
abelian squares of length $2\sqrt{2q(q+1)}$ or greater. We thus prove that the length of 
the longest binary word avoiding abelian squares of length $2k$ is $\Theta(k^2)$. 



1 Introduction 



Let $\Sigma$ be a finite alphabet. A word $w \in \Sigma^*$ is an abelian square of order $k$ if $w = xx'$ with 
$|x| = |x'| = k$ and $x'$ a permutation of $x$. In 1972, Entringer, Jackson, and Schatz proved 
that all infinite binary sequences contain arbitrarily large abelian squares [1]. In particular, 
they showed that all binary words $w \in \{0,1\}^*$ of length $k^2 + 6k$ contain an abelian square 
of order $k$ or greater. In this paper, we examine $\ell(k)$, the length of the longest binary word 
avoiding abelian squares $xx'$ with $|x| \ge k$. 

Precise values of $\ell(k)$ have been computed for $1 \le k \le 10$ by Jeffrey Shallit and Narad 
Rampersad via a brute force search. The results are given in Section 2. 
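
A brute-force search of this kind can be sketched as follows. This is only an illustration, not the program used by Shallit and Rampersad: the function names are our own, and the search simply extends words one symbol at a time, discarding any word that contains an abelian square of order $k$ or greater. Two adjacent blocks of a binary word are permutations of each other exactly when they contain the same number of ones, which is the test used here.

```python
def contains_long_abelian_square(word, k):
    """True if `word` has a factor xx' with x' a permutation of x and |x| >= k."""
    n = len(word)
    # ones[i] = number of '1' symbols among the first i symbols of word
    ones = [0]
    for ch in word:
        ones.append(ones[-1] + (ch == "1"))
    for r in range(k, n // 2 + 1):
        for i in range(n - 2 * r + 1):
            # Adjacent blocks of length r match up to permutation
            # exactly when they contain the same number of ones.
            if ones[i + r] - ones[i] == ones[i + 2 * r] - ones[i + r]:
                return True
    return False


def longest_avoiding_length(k):
    """Length of the longest binary word with no abelian square of order >= k."""
    best = 0
    stack = [""]  # depth-first search over all avoiding words
    while stack:
        word = stack.pop()
        best = max(best, len(word))
        for ch in "01":
            longer = word + ch
            if not contains_long_abelian_square(longer, k):
                stack.append(longer)
    return best


if __name__ == "__main__":
    for k in (1, 2):
        print(k, longest_avoiding_length(k))
```

The search terminates because every sufficiently long binary word contains an abelian square of order $k$ or greater. For $k = 1$ and $k = 2$ it should reproduce the first two values reported in Section 2; the running time grows quickly with $k$.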

The bound $\ell(k) < k^2 + 6k$ given by Entringer, Jackson, and Schatz is not the best 
possible upper bound, but an improved upper bound remains unknown. A simple lower 
bound $\ell(k) \ge 8k - 6$ can be obtained by observing that the string $0^{2k-2} 1^{2k-1} 0^{2k-1} 1^{2k-2}$ 
contains no abelian squares of order $k$ or greater. This lower bound is tight for $2 \le k \le 7$, 
but is suboptimal for $k \ge 8$. 
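
As a quick sanity check of this lower bound, the sketch below builds a four-block word of length $8k - 6$ and computes the largest order of an abelian square it contains. The block lengths $2k-2$, $2k-1$, $2k-1$, $2k-2$ are an assumption on our part, read off from the extremal words listed in Section 2 for small $k$; the function names are likewise ours.

```python
from collections import Counter


def block_word(k):
    """Four-block word of length 8k - 6 (block lengths assumed from Section 2)."""
    return "0" * (2 * k - 2) + "1" * (2 * k - 1) + "0" * (2 * k - 1) + "1" * (2 * k - 2)


def is_abelian_square(factor):
    """True if factor = xx' with x nonempty and x' a permutation of x."""
    half, odd = divmod(len(factor), 2)
    if odd or half == 0:
        return False
    return Counter(factor[:half]) == Counter(factor[half:])


def max_abelian_square_order(word):
    """Largest r such that some factor of length 2r is an abelian square (0 if none)."""
    n = len(word)
    for r in range(n // 2, 0, -1):
        if any(is_abelian_square(word[i:i + 2 * r]) for i in range(n - 2 * r + 1)):
            return r
    return 0


if __name__ == "__main__":
    for k in range(2, 8):
        w = block_word(k)
        # columns: k, length of the word, largest abelian square order found
        print(k, len(w), max_abelian_square_order(w))
```

Under these assumptions, the largest order found stays below $k$ for each $k$ tested, in line with the bound $\ell(k) \ge 8k - 6$.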

In this paper, we give a quadratic lower bound for $\ell(k)$, proving that $\ell(k)$ is $\Theta(k^2)$. 
Moreover, we provide an intuitive geometric characterization of abelian squares in a binary 
word by treating each character of a string as a step of a lattice path in the Cartesian plane. 
We use this geometric notion to construct, for all $q$, a word of length $q(q+1)$ containing no 
abelian squares of order $\ge \sqrt{2q(q+1)}$. 

Many thanks go to Jeffrey Shallit for suggesting this as a problem to study as part of 
CS 860: Patterns in Strings: Existence, Avoidability, Enumeration, a course he developed 
and taught at the University of Waterloo. 



2 Values of $\ell(k)$ for $1 \le k \le 10$ 

Jeffrey Shallit and Narad Rampersad have provided the values of $\ell(k)$ for $1 \le k \le 10$. 
We give them here, alongside the lexicographically least word of length $\ell(k)$ containing no 
abelian squares of order $k$ or greater: 



k     $\ell(k)$     lexicographically least word of length $\ell(k)$

1     3             010
2     10            0011100011
3     18            000011111000001111
4     26            00000011111110000000111111
5     34            0000000011111111100000000011111111
6     42            000000000011111111111000000000001111111111
7     50            00000000000001000001100001111001111101111111111111
8     62            00000000000000010000100100011001100111011011110111111111111111
9     76            0000000000000000010000000110010000111010001111011001111111011111111111111111
10    90            000000000000000000010000001001000001101010000111101010011111011011111101111111111111111111
11    $\ge 106$
12    $\ge 124$
13    $\ge 139$

3 Main Result 

Given a word $w[1..t] \in \{0,1\}^*$, let $s_i = \sum_{j=1}^{i} w[j]$ be the nondecreasing sequence of prefix 
sums of $w$. By plotting the ordered pairs $(i, s_i)$ for $0 \le i \le t$, we obtain a representation of 
$w$ as a path across the Cartesian lattice, stepping east when $w$ contains a 0, and northeast 
when $w$ contains a 1. An example for the string 100110001 is shown in Figure 1. 

Figure 1: Lattice path for 100110001 




We note that the number of ones in $w[m..n]$ is $s_n - s_{m-1}$. Consequently, $w[i+1..i+2r]$ is 
an abelian square iff $s_{i+r} - s_i = s_{i+2r} - s_{i+r}$, which occurs precisely when $(i, s_i)$, $(i+r, s_{i+r})$, 
and $(i+2r, s_{i+2r})$ are three equally spaced collinear points in our lattice path. In Figure 1, 
the three circled points indicate the presence of the subword 001100, an abelian square. 
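
The characterization above translates directly into a short program. The following sketch (function names are our own) computes the prefix sums of a binary word and lists every pair $(i, r)$ for which the three lattice points at horizontal positions $i$, $i+r$, $i+2r$ are collinear, that is, for which $w[i+1..i+2r]$ is an abelian square.

```python
def prefix_sums(word):
    """Return [s_0, ..., s_t], where s_i is the number of ones in word[:i]."""
    sums = [0]
    for ch in word:
        sums.append(sums[-1] + int(ch))
    return sums


def collinear_triples(word):
    """All (i, r), r >= 1, with (i, s_i), (i+r, s_{i+r}), (i+2r, s_{i+2r}) collinear."""
    s = prefix_sums(word)
    t = len(word)
    return [
        (i, r)
        for r in range(1, t // 2 + 1)
        for i in range(t - 2 * r + 1)
        # equally spaced points are collinear iff the two rises are equal
        if s[i + r] - s[i] == s[i + 2 * r] - s[i + r]
    ]


if __name__ == "__main__":
    word = "100110001"
    print(prefix_sums(word))
    for i, r in collinear_triples(word):
        # word[i:i + 2*r] is w[i+1..i+2r] in the 1-based notation of the text
        print(i, r, word[i:i + 2 * r])
```

For the example string 100110001, the pair $(i, r) = (1, 3)$ is among those reported and corresponds to the subword 001100.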

Next, we give our construction of a word of length $q(q+1)$ containing no abelian squares of 
order $\ge \sqrt{2q(q+1)}$. We design our word $w$ so that its lattice path approximates a quadratic 
function; this ensures that three equally spaced points along the path can be collinear only 
if they are sufficiently close together. For $0 \le i \le q(q+1)$, define 

\[ a_i = \left\lfloor \frac{i^2}{2q(q+1)} \right\rfloor . \]



We note that if $i \le q(q+1)$, then $i^2 - (i-1)^2 = 2i - 1 < 2q(q+1)$, so consecutive values of 
$i^2 / (2q(q+1))$ differ by less than 1, and hence 
$a_i - a_{i-1} \in \{0, 1\}$ for all $1 \le i \le q(q+1)$. We can thus define a binary word $w = w[1..q(q+1)]$ 
by $w[i] = a_i - a_{i-1}$. We will show the following: 



Theorem 1. $w$ contains no abelian squares $xx'$ with $|x| \ge \sqrt{2q(q+1)}$. 
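
The construction and the statement of Theorem 1 can be checked numerically for small $q$. The sketch below reads the definition as $a_i = \lfloor i^2 / (2q(q+1)) \rfloor$ and $w[i] = a_i - a_{i-1}$; the function names are our own, and the check is an illustration rather than a substitute for the proof.

```python
def quadratic_word(q):
    """Binary word w[1..q(q+1)] with w[i] = a_i - a_{i-1}, a_i = floor(i^2 / (2q(q+1)))."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    n = q * (q + 1)
    a = [(i * i) // (2 * n) for i in range(n + 1)]
    steps = [a[i] - a[i - 1] for i in range(1, n + 1)]
    if any(step not in (0, 1) for step in steps):
        raise ValueError("steps are not binary")
    return "".join(map(str, steps))


def max_abelian_square_order(word):
    """Largest r such that some factor of length 2r is an abelian square (0 if none)."""
    n = len(word)
    s = [0]
    for ch in word:
        s.append(s[-1] + int(ch))
    for r in range(n // 2, 0, -1):
        for i in range(n - 2 * r + 1):
            if s[i + r] - s[i] == s[i + 2 * r] - s[i + r]:
                return r
    return 0


if __name__ == "__main__":
    for q in range(1, 7):
        w = quadratic_word(q)
        r = max_abelian_square_order(w)
        # Theorem 1 asserts r < sqrt(2q(q+1)), i.e. r*r < 2q(q+1)
        print(q, len(w), r, r * r < 2 * q * (q + 1))
```

For example, $q = 2$ gives the word 000111, whose largest abelian square has order 1.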



Our theorem implies that if $q$ is an integer with $2q(q+1) \le k^2$, then there exists a binary 
word of length $q(q+1)$ containing no abelian squares of order $k$ or greater. For a given $k$, the largest 
such $q$ is 

\[ q = \left\lfloor \frac{\sqrt{1 + 2k^2} - 1}{2} \right\rfloor . \]



Consequently, we may conclude the following: 



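Corollary 2. 

\[ \ell(k) \ge \left\lfloor \frac{\sqrt{1 + 2k^2} - 1}{2} \right\rfloor \left( \left\lfloor \frac{\sqrt{1 + 2k^2} - 1}{2} \right\rfloor + 1 \right) > \frac{k^2}{2} - \cdots \]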
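
To see what the corollary gives in practice, the sketch below evaluates the largest admissible $q$ and the resulting bound $q(q+1)$ in exact integer arithmetic, and compares it with the exact values from Section 2. The function names and the `TABLE` dictionary are our own; the table entries are copied from Section 2.

```python
from math import isqrt


def largest_q(k):
    """Largest integer q >= 0 with 2q(q+1) <= k^2, i.e. floor((sqrt(1 + 2k^2) - 1) / 2)."""
    return (isqrt(1 + 2 * k * k) - 1) // 2


def corollary_bound(k):
    """Lower bound q(q+1) on l(k) obtained from the largest admissible q."""
    q = largest_q(k)
    return q * (q + 1)


# Exact values of l(k) for 1 <= k <= 10, as listed in Section 2.
TABLE = {1: 3, 2: 10, 3: 18, 4: 26, 5: 34, 6: 42, 7: 50, 8: 62, 9: 76, 10: 90}

if __name__ == "__main__":
    for k, exact in TABLE.items():
        # columns: k, largest q, bound q(q+1), exact value from Section 2
        print(k, largest_q(k), corollary_bound(k), exact)
```

For small $k$ the bound is well below the exact values, as one would expect from an asymptotic construction. Since the chosen $q$ exceeds $(\sqrt{1 + 2k^2} - 1)/2 - 1$, the bound $q(q+1)$ is always greater than $k^2/2 - \sqrt{1 + 2k^2} + 1$; this last estimate is our own derivation and need not be the form in which the corollary states it.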


4 Proof of Theorem 1 



Suppose $w$ contains an abelian square $xx'$ with $|x| = r$. Then there exist two adjacent blocks 
$w[i+1..i+r]$ and $w[i+r+1..i+2r]$ such that $|w[i+1..i+r]|_1 = |w[i+r+1..i+2r]|_1$. This 
implies that $a_{i+r} - a_i = a_{i+2r} - a_{i+r}$. We eliminate the floor function to bound the various 
$a_i$ values above and below in the following manner: 

\[ \frac{i^2}{2q(q+1)} - 1 < a_i, \qquad a_{i+r} \le \frac{(i+r)^2}{2q(q+1)}, \qquad \frac{(i+2r)^2}{2q(q+1)} - 1 < a_{i+2r}. \]



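Taking a linear combination of the above inequalities, we obtain 

\[ \frac{i^2}{2q(q+1)} - 1 + 2a_{i+r} + \frac{(i+2r)^2}{2q(q+1)} - 1 < a_i + 2\,\frac{(i+r)^2}{2q(q+1)} + a_{i+2r}, \]

and we may cancel the $a$ terms since $a_{i+r} - a_i = a_{i+2r} - a_{i+r}$. We simplify what remains to 
obtain our result: 

\[ \begin{aligned}
\frac{i^2}{2q(q+1)} + \frac{(i+2r)^2}{2q(q+1)} - 2 &< 2\,\frac{(i+r)^2}{2q(q+1)} \\
i^2 + (i+2r)^2 - 4q(q+1) &< 2(i+r)^2 \\
r^2 &< 2q(q+1).
\end{aligned} \]

Hence $r < \sqrt{2q(q+1)}$, which proves Theorem 1. 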



5 Additional Remarks 



One might suggest that we could improve our lower bound slightly by computing more $a_i$ 
values and extending $w$ to a longer string. Indeed, we can take 

\[ a_i = \left\lfloor \frac{i^2}{2q(q+1)} \right\rfloor \]

for all $i$ until we reach an $n$ such that $a_{n+1} - a_n > 1$. Unfortunately, it turns out that this doesn't help us 
much. Taking $p = q(q+1) + \lceil \sqrt{2q(q+1)} \rceil$, we see that 



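\[ a_p - a_{q(q+1)} = \left\lfloor \frac{p^2}{2q(q+1)} \right\rfloor - \frac{q(q+1)}{2} = \left\lceil \sqrt{2q(q+1)} \right\rceil + \left\lfloor \frac{\lceil \sqrt{2q(q+1)} \rceil^2}{2q(q+1)} \right\rfloor \ge p - q(q+1) + 1. \]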



Consequently, there must be some $n$ with $q(q+1) \le n < p$ such that $a_{n+1} - a_n > 1$. Thus 
we can extend $w$ for at most another $\lceil \sqrt{2q(q+1)} \rceil$ symbols. 
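
This remark can also be checked numerically. The sketch below (function names are our own) finds the first index $n$ at which $a_{n+1} - a_n$ exceeds 1, again reading $a_i$ as $\lfloor i^2 / (2q(q+1)) \rfloor$, and compares it with $q(q+1)$ and with $p = q(q+1) + \lceil \sqrt{2q(q+1)} \rceil$.

```python
from math import isqrt


def first_bad_step(q):
    """Smallest n >= 0 with a_{n+1} - a_n > 1, where a_i = floor(i^2 / (2q(q+1)))."""
    d = 2 * q * (q + 1)
    n = 0
    while (n + 1) ** 2 // d - n * n // d <= 1:
        n += 1
    return n


def ceil_sqrt(m):
    """Smallest integer c with c * c >= m, computed without floating point."""
    r = isqrt(m)
    return r if r * r == m else r + 1


if __name__ == "__main__":
    for q in range(1, 8):
        base = q * (q + 1)
        n = first_bad_step(q)
        p = base + ceil_sqrt(2 * base)
        # n - base is the number of extra symbols by which w can be extended
        print(q, base, n, p, n - base)
```

The steps $a_i - a_{i-1}$ for $1 \le i \le n$ are all 0 or 1, so the extended word has length $n$; in each case tried, $n$ falls below $p$ as the argument above predicts.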